-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libvirt: Enable multiple PodVM image scenario #2061
libvirt: Enable multiple PodVM image scenario #2061
Conversation
I think this need to be draft until #2036 which bumps us to the 3.9.0 version of the kata runtime is merged? |
daa9fea
to
d1df0da
Compare
I missed the kata-containers PR, do we expect other providers to use the same annotations for their podvm images? e.g. in AWS, will |
Hi Snir, |
077b833
to
4f00cce
Compare
e2e test results:
|
4f00cce
to
b18e30a
Compare
@@ -406,6 +406,32 @@ func DoTestPodVMwithAnnotationsLargerCPU(t *testing.T, e env.Environment, assert | |||
NewTestCase(t, e, "PodVMwithAnnotationsLargerCPU", assert, "Failed to Create PodVM with Annotations Larger CPU").WithPod(pod).WithInstanceTypes(testInstanceTypes).WithCustomPodState(v1.PodPending).Run() | |||
} | |||
|
|||
func DoTestCreatePeerPodContainerWithAlternateImage(t *testing.T, e env.Environment, assert CloudAssert) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the negative test is a good start to check that the code path works. I'm wondering if we could use the provisioner.UploadPodvm
logic to upload another podvmImage if a test image is provided (I guess the same image with a different name might be the best we can hope for) and then switch the annotation to try using that. I guess the problem then becomes how we check that the alternate podvm is being used might be tricky?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, that's true.. but if checking the CAA logs and ensuring that the PodVM picked up is the alternate one as per the annotation sounds good, then I guess we can try that option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - the CAA logs would be a good place to check - I didn't think about that. I'd say it's probably worth spending a bit of time to investigate this approach at least then
b18e30a
to
f451cd9
Compare
Latest e2e test results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few refactor suggestions, but otherwise it looks good. Thanks!
var podlist v1.PodList | ||
var podLogString string | ||
expectedSuccessMessage := "Choosing " + tc.alternateImageName | ||
if err := client.Resources("confidential-containers-system").List(ctx, &podlist); err != nil { | ||
t.Fatal(err) | ||
} | ||
for _, pod := range podlist.Items { | ||
if pod.Labels["app"] == "cloud-api-adaptor" { | ||
podLogString, _ = GetPodLog(ctx, client, pod) | ||
break | ||
} | ||
} | ||
if strings.Contains(podLogString, expectedSuccessMessage) { | ||
t.Logf("PodVM was brought up using the alternate PodVM image %s", tc.alternateImageName) | ||
} else { | ||
t.Logf("cloud-api-adaptor pod logs: %s", podLogString) | ||
yamlData, err := yaml.Marshal(tc.pod.Status) | ||
if err != nil { | ||
log.Errorf("Error marshaling pod.Status to JSON: %s", err.Error()) | ||
} else { | ||
t.Logf("Current Pod State: %v", string(yamlData)) | ||
} | ||
t.Fatal("failed due to unknown reason") | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional request: Could we move this section to assessment_helper.go as a new function? I'm trying to reduce how massive the assessment_runner has got with this strategy in some WIP commits I'm working on on the side.
@@ -68,6 +69,9 @@ func NewLibvirtProvisioner(properties map[string]string) (pv.CloudProvisioner, e | |||
vol_name = properties["libvirt_vol_name"] | |||
} | |||
|
|||
// alternate_vol_name is used for multi-podvm testing. | |||
alternate_vol_name := "another-podvm-base.qcow2" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a possibility to extract this as a constant, so we can reference it in libvirt_test rather than having to copy the value there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually I tried to do this way, but unfortunately common.go
inside the e2e
folder is not extending to this provision_common.go
, also thought about adding it to common.go
under provisioner
, which is again not referenced on libvirt_test
/common_suite
under e2e. if it sounds fair to import, i can try this out..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I see. One approach we could take (which I'm not sure is even good) is create the const in libvirt/provision_common.go
and then we can reference it in libvirt_test.go
and pass it in as an argument to the common_test.go version? Otherwise I think we can leave it for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Steve, this sounds like an idea, will test it out..
@@ -135,6 +136,11 @@ func (tc *TestCase) WithInstanceTypes(testInstanceTypes InstanceValidatorFunctio | |||
return tc | |||
} | |||
|
|||
func (tc *TestCase) WithMultiplePodVM(alternateImageName string) *TestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if WithMultiplePodVM
is the best name here? Maybe something like WithAlternateImage
would be clearer?
fde68ba
to
5f5ef7e
Compare
@stevenhorsman Would it be fair to add |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple of refactor things to look into.
@@ -525,3 +525,31 @@ func AddImagePullSecretToDefaultServiceAccount(ctx context.Context, client klien | |||
} | |||
return nil | |||
} | |||
|
|||
func VerifyCloudAPIAdaptorLogs(ctx context.Context, client klient.Client, t *testing.T, pod *v1.Pod, expectedSuccessMessage string) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wasn't super clear with the refactor suggestion, I was thinking of including the "Choosing ..." logic in the helper method as well. I think you've done too good a job and the generic VerifyCloudAPIAdaptorLogs
is a great idea, so can you review other methods to see if it can be used elsewhere e.g. IsPulledWithNydusSnapshotter
please.
but it ComparePodLogString
if err := client.Resources("confidential-containers-system").List(ctx, &podlist); err != nil { | ||
return err | ||
} | ||
for _, pod := range podlist.Items { | ||
if pod.Labels["app"] == "cloud-api-adaptor" { | ||
podLogString, _ = GetPodLog(ctx, client, pod) | ||
break | ||
} | ||
} | ||
if strings.Contains(podLogString, expectedSuccessMessage) { | ||
t.Logf("cloud-api-adaptor logs are matching with the expected message") | ||
} else { | ||
yamlData, err := yaml.Marshal(pod.Status) | ||
if err != nil { | ||
t.Fatalf("Error marshaling pod.Status to JSON: %s", err.Error()) | ||
} else { | ||
t.Logf("Current Pod State: %v", string(yamlData)) | ||
t.Fatal("failed due to unknown reason") | ||
} | ||
return errors.New("failed due to unknown reason") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from the pod app being hard-coded (which might be enough of an issue to make this not re-usable), this looks pretty similar to ComparePodLogString
, so I wonder if you could call ComparePodLogString
from within this to reduce duplication?
5f5ef7e
to
5634221
Compare
@@ -203,11 +203,15 @@ func (s *cloudService) CreateVM(ctx context.Context, req *pb.CreateVMRequest) (r | |||
// Get Pod VM cpu and memory from annotations | |||
vcpus, memory := util.GetCPUAndMemoryFromAnnotation(req.Annotations) | |||
|
|||
// Get Pod VM image from annotations | |||
image := util.GetImageFromAnnotation(req.Annotations) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the generic changes be in a separate commit to distinguish from provider specific implementation ?
5634221
to
07b98ae
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks good, we just need to wait until the libvirt e2e tests are back and confirm the tests work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I have no means to test this.
@ajaypvictor can you rebase this PR please and hopefully the tests should run properly then. Thanks! |
bc5cf2a
to
c134ef1
Compare
It looks like in the e2e testing |
The job test is passing locally for me:
|
I've just spotted that the CI e2e test show failure logs for the deletion-test of:
which doesn't seem right... I've confirmed locally that running |
@ajaypvictor - I've been able to reproduce this issue by doing Here is the caa log for the relevant section:
Hopefully this is enough info to give you a path to debug 🤞 but we can catch-up tomorrow |
Thanks Steve for figuring this out and apologies for the miss from myside.. |
c134ef1
to
4cebdcc
Compare
Enable support for multiple PodVM images for cloud providers with the merge of kata-containers/kata-containers#10274 Fixes confidential-containers#1282 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
Enable support for multiple PodVM images for libvirt provider Fixes confidential-containers#1282 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
Tests for multiple PodVM images with an invalid non existing volume and a copy of the valid volume with a different name Fixes confidential-containers#1282 Signed-off-by: Ajay Victor <ajvictor@in.ibm.com>
4cebdcc
to
589382d
Compare
I think the edge case was that the CAA would continue to use the annotation set previously rather than switching to default for the next podvm, fixed this case.
|
Enable support for multiple PodVM images for libvirt provider with the merge of kata-containers/kata-containers#10274
Fixes #1282
Working Scenario:
Tried with the following busybox manifest:
Here the
pppvm-nov-vol.qcow2
is a valid libvirt volume and a podvm image is uploaded there.CAA logs:
Failure Scenario:
Tried with the following busybox manifest:
Here the
rhel9-os
is a dummy variable and not an actual libvirt volume.CAA logs: